Code clone detection based on dependency enhanced hierarchical abstract syntax tree
Zexuan WAN, Chunli XIE, Quanrun LYU, Yao LIANG
Journal of Computer Applications    2024, 44 (4): 1259-1268.   DOI: 10.11772/j.issn.1001-9081.2023040485

In software engineering, code clone detection methods based on semantic similarity can reduce software maintenance costs and help prevent system vulnerabilities. As a typical abstract representation of code, the Abstract Syntax Tree (AST) has been used successfully for code clone detection in many programming languages. However, existing work mainly extracts code semantics from the original AST and does not mine the deeper semantic and structural information that the AST contains. To address this problem, a code clone detection method based on the Dependency Enhanced Hierarchical Abstract Syntax Tree (DEHAST) was proposed. Firstly, the AST was layered and divided into different semantic levels. Secondly, corresponding dependency enhancement edges were added to the different levels of the AST to construct DEHAST, thereby transforming a simple AST into a heterogeneous graph with richer program semantics. Finally, a Graph Matching Network (GMN) model was used to measure the similarity of the heterogeneous graphs and thus detect code clones. Experimental results on two datasets, BigCloneBench and Google Code Jam, show that DEHAST is able to detect 100% of Type-1 and Type-2 code clones, 99% of Type-3 code clones, and 97% of Type-4 code clones; compared with the tree-based method ASTNN (AST-based Neural Network), the F1 values all increase by 4 percentage points. Therefore, DEHAST can effectively detect semantic code clones.
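
The general idea of turning an AST into a heterogeneous graph can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the layering rules or the edge types, so the three levels (stmt, expr, other) and the two enhancement edges (next_sibling, data_flow) used here are assumptions chosen for illustration, and Python's standard ast module stands in for whatever parser the target language requires. The function name build_enhanced_graph is hypothetical, and the graph matching step is omitted.

```python
import ast


def build_enhanced_graph(source):
    """Parse `source` and return (nodes, edges) for a toy enhanced AST.

    nodes: list of (node_type_name, level); the list index is the node id.
    edges: list of (src_id, dst_id, edge_kind).
    Levels and edge kinds are illustrative assumptions, not the paper's.
    """
    nodes = []
    edges = []
    last_write = {}  # variable name -> id of the most recently visited write

    def level_of(node):
        # Assumed layering: statement level, expression level, everything else.
        if isinstance(node, ast.stmt):
            return "stmt"
        if isinstance(node, ast.expr):
            return "expr"
        return "other"

    def visit(node, parent_id):
        node_id = len(nodes)
        nodes.append((type(node).__name__, level_of(node)))
        if parent_id is not None:
            edges.append((parent_id, node_id, "child"))  # original AST edge

        # Naive data-flow edge: link each read of a name to the most recently
        # visited write, ignoring scopes, control flow and evaluation order.
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                last_write[node.id] = node_id
            elif isinstance(node.ctx, ast.Load) and node.id in last_write:
                edges.append((last_write[node.id], node_id, "data_flow"))

        previous_child = None
        for child in ast.iter_child_nodes(node):
            if isinstance(child, ast.expr_context):
                continue  # skip Load/Store markers; they carry no structure
            child_id = visit(child, node_id)
            if previous_child is not None:
                # Sibling-order edge between consecutive children.
                edges.append((previous_child, child_id, "next_sibling"))
            previous_child = child_id
        return node_id

    visit(ast.parse(source), None)
    return nodes, edges
```

Calling build_enhanced_graph("x = 1\ny = x + 2\n") yields the plain child edges of the tree plus sibling-order edges and one data_flow edge from the write of x to its later read. A graph of this shape, with typed edges, is the kind of input a graph matching model could compare pairwise.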
